Comparison: Conference Proceedings from ICALT in 2011 and 2012

This is an uninterpreted, automatically generated report showing how the terms used in the full text of papers differ between two conferences: ICALT 2011 and ICALT 2012. Each conference has its own subsection showing the terms that appear more frequently in its papers. Terms are selected on statistical significance: the probability that the difference in frequency is due to pure chance must be less than 0.1%, and further criteria are applied to select dominant terms (see "technicalities").
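
As an illustration of the 0.1% criterion, the sketch below tests whether a term's frequency differs between two sets of papers. The report does not say here which statistical test is applied, so a pooled two-proportion z-test is swapped in purely as an assumption; the function names and the example counts are hypothetical.

```python
import math

P_THRESHOLD = 0.001  # "less than 0.1%", as stated in the report


def two_proportion_p_value(count_a, total_a, count_b, total_b):
    """Two-sided p-value for a difference in term frequency between two sets.

    Assumption: a pooled two-proportion z-test stands in for whatever test
    the report's own code applies.
    """
    p_a = count_a / total_a
    p_b = count_b / total_b
    pooled = (count_a + count_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (p_a - p_b) / std_err
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


def is_significant(count_a, total_a, count_b, total_b, threshold=P_THRESHOLD):
    """True if the difference passes the significance threshold."""
    return two_proportion_p_value(count_a, total_a, count_b, total_b) < threshold
```

For example, a term seen 300 times in 100,000 terms of one set and 150 times in 100,000 terms of the other would pass this threshold, while counts of 100 and 105 would not. The other criteria for dominant terms are not modelled here.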

All plots will open in a new window/tab as 1000x1000 pixel images if clicked on. The "Wordle" is 1024x768.

Overview of Selected Terms

Only middle-frequency words are considered; the comparison is between terms that are neither very common nor very rare in the aggregate of all papers being analysed.
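
A minimal sketch of a middle-frequency filter follows. The actual cut-offs used for this report are not given in the text, so `min_count` and `max_fraction` are hypothetical parameters, and the input is assumed to be one list of tokens per paper.

```python
from collections import Counter


def middle_frequency_terms(docs, min_count, max_fraction):
    """Return terms that are neither very rare nor very common overall.

    docs: iterable of token lists, one per paper (assumed input format).
    min_count: hypothetical lower bound on the aggregate count of a term.
    max_fraction: hypothetical upper bound on a term's share of all tokens.
    """
    counts = Counter(token for doc in docs for token in doc)
    total = sum(counts.values())
    return {
        term
        for term, count in counts.items()
        if count >= min_count and count / total <= max_fraction
    }
```

The counts are taken over the aggregate of both sets of papers, matching the description above, before any comparison between the sets is made.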

Word cloud of compared terms
Word cloud for both sets combined, before considering the difference in occurrence between the two sets of papers. Word sizes indicate frequency.

 

ICALT 2011

Frequency Plot

This plot shows those terms with a statistically-significant higher frequency in ICALT 2011 papers.

ICALT 2011 Frequencies and Significance
Frequency = the fraction of terms
Significance = -log10 of the probability that the difference in frequency between the conferences is "pure chance" (e.g. 3 is 1 in 1,000, 4 is 1 in 10,000, etc.)
Docs/1000 = the number of documents in the set that contain the term per thousand (colour code and area of square)
Also available as hi-res pdf.
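
The quantities in the legend above can be sketched as follows. The function names are hypothetical, and reading "the fraction of terms" as a term's count divided by the total number of term occurrences in the set is an assumption.

```python
import math


def term_frequency(term_count, total_terms):
    """Share of all term occurrences in a set accounted for by one term.

    Assumption: this is what "the fraction of terms" means in the legend.
    """
    return term_count / total_terms


def significance(p_value):
    """-log10 of the p-value: 0.001 maps to 3, 0.0001 maps to 4."""
    if not 0 < p_value <= 1:
        raise ValueError("p_value must be in the interval (0, 1]")
    return -math.log10(p_value)


def docs_per_thousand(docs_with_term, total_docs):
    """Number of documents containing the term, per thousand documents."""
    return 1000 * docs_with_term / total_docs
```

On this scale the 0.1% selection threshold corresponds to a significance of 3, so every plotted term should score above 3.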

 

Term Co-occurrence Graph

This plot shows the extent to which pairs of the higher frequency terms occur together in the same paper.

ICALT 2011 Term Co-occurrence
Node size = relative significance
Node colour is according to grouping
Edge (connector) size = number of documents containing both connected terms.
NB: only edges in the top 25% are shown
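
The edge weights described above could be computed along the following lines. This is a sketch only: the input format (one collection of terms per paper) and the function names are hypothetical, and interpreting "top 25%" as the highest-ranked quarter of edges by weight, with ties at the cut-off kept, is an assumption.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_edges(doc_terms, selected_terms):
    """Count, for each pair of selected terms, the papers containing both."""
    selected = set(selected_terms)
    counts = Counter()
    for terms in doc_terms:
        # Sort so that each pair has one canonical ordering.
        present = sorted(selected & set(terms))
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


def top_quarter(edges):
    """Keep edges ranked in the top 25% by weight (ties at the cut-off kept)."""
    if not edges:
        return {}
    weights = sorted(edges.values(), reverse=True)
    keep = max(1, math.ceil(len(weights) * 0.25))
    cutoff = weights[keep - 1]
    return {pair: weight for pair, weight in edges.items() if weight >= cutoff}
```

Each paper contributes at most one to an edge, however often the two terms appear in it, which matches the definition of edge size as a document count.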

 

ICALT 2012

Frequency Plot

This plot shows those terms with a statistically-significant higher frequency in ICALT 2012 papers.

ICALT 2012 Frequencies and Significance
Frequency = the fraction of terms
Significance = -log10 of the probability that the difference in frequency between the conferences is "pure chance" (e.g. 3 is 1 in 1,000, 4 is 1 in 10,000, etc.)
Docs/1000 = the number of documents in the set that contain the term per thousand (colour code and area of square)
Also available as hi-res pdf.

 

Term Co-occurrence Graph

This plot shows the extent to which pairs of the higher frequency terms occur together in the same paper.

ICALT 2012 Term Co-occurrence
Node size = relative significance
Node colour is according to grouping
Edge (connector) size = number of documents containing both connected terms.
NB: only edges in the top 25% are shown

 

Information

Source Code, Data and Technicalities

Source code for processing and formatting is available on GitHub.

Raw results are available in pairs: each kind of file comes in two versions, one for each of the two conference sections above. Gephi files are available separately for ICALT 2011 and ICALT 2012. All are under the same licence terms as this report.

The log file contains run parameters.

The technicalities of the method and explanatory notes on the content of the above downloads may be found on the GitHub wiki. These notes explain the term-selection criteria.

Copyright, Licence and Credits

This work was undertaken as part of the TEL-Map Project; TEL-Map is a support and coordination action within EC IST FP7 Technology Enhanced Learning.

This work, its images and original text are ©2012 Adam Cooper, Institute for Educational Cybernetics, University of Bolton, UK.
Adam Cooper has licensed it under a Creative Commons Attribution 3.0 Unported License.